Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher.
Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?
Some links on this page may take you to non-federal websites. Their policies may differ from this site.
-
Free, publicly-accessible full text available June 10, 2026
-
Free, publicly-accessible full text available May 26, 2026
-
Typosquatting—the practice of registering a domain name similar to another, usually well-known, domain name—is typically intended to drive traffic to a website for malicious or profit- driven purposes. In this paper we assess the current state of typosquatting, both broadly (across a wide variety of techniques) and deeply (using an extensive and novel dataset). Our breadth derives from the application of eight different candidate-generation techniques to a selection of the most popular domain names. Our depth derives from probing the resulting name set via a unique corpus comprising over 3.3B Domain Name System (DNS) records. We find that over 2.3M potential typosquatting names have been registered that resolve to an IP address. We then assess those names using a framework focused on identifying the intent of the domain from the perspectives of DNS and webpage clustering. Using the DNS information, HTTP responses, and Google SafeBrowsing, we classify the candidate typosquatting names as resolved to private IP, malicious, defensive, parked, legitimate, or unknown intents. Our findings provide the largest-scale and most-comprehensive perspective to date on typosquatting, exposing potential risks to users. Further, our methodology provides a blueprint for tracking and classifying typosquatting on an ongoing basis.more » « lessFree, publicly-accessible full text available May 26, 2026
-
Typosquatting—the practice of registering a domain name similar to another, usually well-known, domain name—is typically intended to drive traffic to a website for malicious or profitdriven purposes. In this paper we assess the current state of typosquatting, both broadly (across a wide variety of techniques) and deeply (using an extensive and novel dataset). Our breadth derives from the application of eight different candidate-generation techniques to a selection of the most popular domain names. Our depth derives from probing the resulting name set via a unique corpus comprising over 3.3B Domain Name System (DNS) records. We find that over 2.3M potential typosquatting names have been registered that resolve to an IP address. We then assess those names using a framework focused on identifying the intent of the domain from the perspectives of DNS and webpage clustering. Using the DNS information, HTTP responses, and Google SafeBrowsing, we classify the candidate typosquatting names as resolved to private IP, malicious, defensive, parked, legitimate, or unknown intents. Our findings provide the largest-scale and most-comprehensive perspective to date on typosquatting, exposing potential risks to users. Further, our methodology provides a blueprint for tracking and classifying typosquatting on an ongoing basis.more » « lessFree, publicly-accessible full text available May 26, 2026
-
Free, publicly-accessible full text available May 7, 2026
-
Software security depends on coordinated vulnerability disclosure (CVD) from researchers, a process that the community has continually sought to measure and improve. Yet, CVD practices are only as effective as the data that informs them. In this paper, we use DScope, a cloud-based interactive Internet telescope, to build statistical models of vulnerability lifecycles, bridging the data gap in over 20 years of CVD research. By analyzing application-layer Internet scanning traffic over two years, we identify real-world exploitation timelines for 63 threats. We bring this data together with six additional datasets to build a complete birth-to-death model of these vulnerabilities, the most complete analysis of vulnerability lifecycles to date. Our analysis reaches three key recommendations: (1) CVD across diverse vendors shows lower effectiveness than previously thought, (2) intrusion detection systems are underutilized to provide protection for critical vulnerabilities, and (3) existing data sources of CVD can be augmented by novel approaches to Internet measurement. In this way, our vantage point offers new opportunities to improve the CVD process, achieving a safer software ecosystem in practice.more » « less
-
Software security depends on coordinated vulnerability disclosure (CVD) from researchers, a process that the community has continually sought to measure and improve. Yet, CVD practices are only as effective as the data that informs them. In this paper, we use DScope, a cloud-based interactive Internet telescope, to build statistical models of vulnerability lifecycles, bridging the data gap in over 20 years of CVD research. By analyzing application-layer Internet scanning traffic over two years, we identify real-world exploitation timelines for 63 threats. We bring this data together with six additional datasets to build a complete birth-to-death model of these vulnerabilities, the most complete analysis of vulnerability lifecycles to date. Our analysis reaches three key recommendations: (1) CVD across diverse vendors shows lower effectiveness than previously thought, (2) intrusion detection systems are underutilized to provide protection for critical vulnerabilities, and (3) existing data sources of CVD can be augmented by novel approaches to Internet measurement. In this way, our vantage point offers new opportunities to improve the CVD process, achieving a safer software ecosystem in practice.more » « less
-
Data from Internet telescopes that monitor routed but unused IP address space has been the basis for myriad insights on malicious, unwanted, and unexpected behavior. However, service migration to cloud infrastructure and the increasing scarcity of IPv4 address space present serious challenges to traditional Internet telescopes. This paper describes DSCOPE, a cloud-based Internet telescope designed to be scalable and interactive. We describe the design and implementation of DSCOPE, which includes two major components. Collectors are deployed on cloud VMs, interact with incoming connection requests, and capture pcap traces. The data processing pipeline organizes, transforms, and archives the pcaps from deployed collectors for post-facto analysis. In comparing a sampling of DSCOPE’s collected traffic with that of a traditional telescope, we see a striking difference in both the quantity and phenomena of behavior targeting cloud systems, with up to 450× as much cloud-targeting as expected under random scanning. We also show that DSCOPE’s adaptive approach achieves impressive price performance: optimal yield of scanners on a given IP address is achieved in under 8 minutes of observation. Our results demonstrate that cloud-based telescopes achieve a significantly broader and more comprehensive perspective than traditional techniques.more » « less
-
Abstract—Signature-based Intrusion Detection Systems (SIDSs) are traditionally used to detect malicious activity in networks. A notable example of such a system is Snort, which compares network traffic against a series of rules that match known exploits. Current SIDS rules are designed to minimize the amount of legitimate traffic flagged incorrectly, reducing the burden on network administrators. However, different use cases than the traditional one–such as researchers studying trends or analyzing modified versions of known exploits–may require SIDSs to be less constrained in their operation. In this paper, we demonstrate that applying modifications to real-world SIDS rules allow for relaxing some constraints and characterizing the performance space of modified rules. We develop an iterative approach for exploring the space of modifications to SIDS rules. By taking the modifications that expand the ROC curve of performance and altering them further, we show how to modify rules in a directed manner. Using traffic collected and identified as benign or malicious from a cloud telescope, we find that the removal of a single component from SIDS rules has the largest impact on the performance space. Effectively modifying SIDS rules to reduce constraints can enable a broader range of detection for various objectives, from increased security to research purposes.more » « less
-
Adversarial examples, inputs designed to induce worst-case behavior in machine learning models, have been extensively studied over the past decade. Yet, our understanding of this phenomenon stems from a rather fragmented pool of knowledge; at present, there are a handful of attacks, each with disparate assumptions in threat models and incomparable definitions of optimality. In this paper, we propose a systematic approach to characterize worst-case (i.e., optimal) adversaries. We first introduce an extensible decomposition of attacks in adversarial machine learning by atomizing attack components into surfaces and travelers. With our decomposition, we enumerate over components to create 576 attacks (568 of which were previously unexplored). Next, we propose the Pareto Ensemble Attack (PEA): a theoretical attack that upper-bounds attack performance. With our new attacks, we measure performance relative to the PEA on: both robust and non-robust models, seven datasets, and three extended ℓp-based threat models incorporating compute costs, formalizing the Space of Adversarial Strategies. From our evaluation we find that attack performance to be highly contextual: the domain, model robustness, and threat model can have a profound influence on attack efficacy. Our investigation suggests that future studies measuring the security of machine learning should: (1) be contextualized to the domain & threat models, and (2) go beyond the handful of known attacks used today.more » « less
An official website of the United States government

Full Text Available